1 Introduction

Consider a discrete finite source with $N$ symbols and probability distribution $\mathbf{p} := (u_1, u_2, \ldots, u_N)$. It is well known that the Huffman encoding algorithm provides an optimal prefix code for this source. A $D$-ary Huffman code is usually represented by a $D$-ary tree $T$ whose leaves correspond to the source symbols. The $D$ edges emanating from each intermediate node of $T$ are labeled with the $D$ letters of the alphabet, and the codeword corresponding to a symbol is the string of labels on the path from the root to the corresponding leaf. Huffman's algorithm is a recursive bottom-up construction of $T$, where at each step the $D$ smallest probabilities are merged into a new unit, which is henceforth represented by an intermediate node in the tree. Throughout this paper, unless $D$ is explicitly specified, we consider binary Huffman codes.

Denote by $l(u)$ the length of the path from the root to a node $u$ of the Huffman tree $T$. Then the expected length of the Huffman code is defined as

$$L(T) := \sum_{i=1}^{N} u_i \, l(u_i). \qquad (1)$$

Similarly, the entropy is defined as

$$H(T) := -\sum_{i=1}^{N} u_i \log_D (u_i). \qquad (2)$$

The Huffman encoding is optimal in the sense that no other code for distribution 
p can have a smaller expected length than L(T). 
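
As a rough numerical illustration of definitions (1) and (2), the following Python sketch builds a binary Huffman code with a heap and evaluates the expected length and the entropy for a small example distribution. The function names and the example probabilities are illustrative assumptions, not part of the paper, and the entropy is taken in base 2 to match the binary case.

```python
import heapq
from math import log2


def huffman_lengths(probs):
    """Return the codeword length l(u_i) of each symbol in a binary Huffman code."""
    if len(probs) == 1:
        return [0]
    # Heap entries: (probability, unique tie-breaker, indices of leaves under the node).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, leaves1 = heapq.heappop(heap)
        p2, _, leaves2 = heapq.heappop(heap)
        for leaf in leaves1 + leaves2:
            lengths[leaf] += 1  # each leaf under the merged node moves one level deeper
        heapq.heappush(heap, (p1 + p2, counter, leaves1 + leaves2))
        counter += 1
    return lengths


def expected_length(probs):
    """Equation (1): sum of u_i * l(u_i)."""
    return sum(p * l for p, l in zip(probs, huffman_lengths(probs)))


def entropy(probs):
    """Equation (2) with D = 2."""
    return -sum(p * log2(p) for p in probs if p > 0)


probs = [0.5, 0.25, 0.125, 0.125]  # example (dyadic) source, chosen for illustration
L = expected_length(probs)
H = entropy(probs)
print(huffman_lengths(probs), L, H, L - H)
```

For a dyadic source such as this one the two quantities coincide; for a non-dyadic source the difference $L(T) - H(T)$ is strictly positive.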

The redundancy $R(T)$ of the code is defined as the difference between the average codeword length $L(T)$ and the entropy $H(T)$ of the source. It is easy to show that the redundancy of the Huffman code is always non-negative and never exceeds 1. These bounds on $R(T)$ can be improved if partial knowledge about the source distribution is available. Gallager, Johnsen [3], Capocelli and De Santis [5], Manstetten, and Capocelli et al. [7] improved the upper bound on the redundancy (of binary Huffman codes) in terms of $p_1 := \max_i u_i$, the probability of the most likely source symbol. On the other hand, the problem of upper bounding the redundancy has also been addressed when $p_N := \min_i u_i$, the probability of the least likely source symbol, is known. Furthermore, Capocelli et al. [7] obtained upper bounds on $R(T)$ when both extreme probabilities, $p_1$ and $p_N$, are known. An upper bound as a function of the two least likely source symbol probabilities is derived in [5].

School of Computer and Communication Sciences, Ecole Polytechnique Federale de Lausanne, Switzerland. E-mail: {soheil.mohajer, payam.pakzad}@epfl.ch

Department of Electrical and Computer Engineering, Isfahan University of Technology, Iran. E-mail: ali_kakhbod@ec.iut.ac.ir






Figure 1: Decomposition of a Huffman tree around an intermediate node u. 

Johnsen [3] presented a tight lower bound on $R(T)$ in terms of $p_1$ when $p_1 \ge 0.4$. Subsequently, such lower bounds were presented by Montgomery and Abrahams [10] for all $p_1$. Golic and Obradovic extended Johnsen's lower bound to the redundancy of a $D$-ary Huffman code. The lower bound on $R(T)$ when only $p_N$ is known has also been considered. Furthermore, the problem of lower bounding $R(T)$ for a binary code when the two least likely probabilities are known has been discussed in the literature.

Recently Ye and Yeung [12] presented simple upper bounds on $R(T)$ in terms of the probability of any source symbol, as opposed to the case where the least or the most likely probability is given. Using a more complicated approach, they conjectured the tight upper bound on $R(T)$ for a source containing a symbol with a given probability $p$, without knowledge of its rank in the source distribution. In this paper we prove this conjecture with a simple approach and show that the upper bound is tight. Similarly, we present a tight lower bound on $R(T)$ for a source that contains a symbol with given probability $p$. We further describe all possible sets of distributions which achieve this lower bound. We show that simple extensions of our results lead to the lower bound on the redundancy when either $p_1$ [10] or $p_N$ is known. We also extend our proof to $D$-ary Huffman codes and find the tight lower bound on $R(T)$ when the probability of any symbol is known.

2 Preliminaries 

In this section we present some definitions and results that will be useful in the 
rest of the paper. 

Let $T = T(\mathbf{p})$ be a binary Huffman tree for a source with probability distribution $\mathbf{p} = (u_1, u_2, \ldots, u_N)$. For the rest of this paper, we identify each node of a Huffman tree by the probability of the corresponding symbol or unit; this is defined as the sum of the probabilities of all the leaf symbols lying in the sub-tree under that node. For each intermediate node $u$ of $T$, let $\Delta_u$ denote the sub-tree of $T$ under $u$, and denote by $u^{-1} * \Delta_u$ its normalized version, where the probability of each node is scaled by a factor of $1/u$, so that the scaled leaf probabilities sum to one. Therefore, $u^{-1} * \Delta_u$ is itself a Huffman tree for a source whose probabilities are given on the leaves of $u^{-1} * \Delta_u$. Similarly, denote by $\bar{\Delta}_u$ the Huffman tree on a source with probabilities which are the same as $\mathbf{p}$, except that all the leaf probabilities of $T$ under $u$ are merged into a single symbol with probability $u$, which appears as a leaf of $\bar{\Delta}_u$. Then it is easy to see that $\bar{\Delta}_u$ corresponds to the sub-tree of $T$ appearing 'above' $u$. See Figure 1 for a schematic diagram of the relationship between $\Delta_u$ and $\bar{\Delta}_u$ and the original Huffman tree $T$.

The following lemma relates the redundancy of a Huffman tree to the redundancies of the sub-trees $u^{-1} * \Delta_u$ and $\bar{\Delta}_u$.

Lemma 1. For any intermediate node $u$ in a Huffman tree $T$, we have

$$R(T) = R(\bar{\Delta}_u) + u\,R(u^{-1} * \Delta_u). \qquad (3)$$

Proof. The average length of any Huffman code equals the sum of the probabilities on the intermediate nodes in the corresponding tree. Each intermediate node of $T$ is either an intermediate node in $\bar{\Delta}_u$ or in $u^{-1} * \Delta_u$, where the probabilities in the latter tree need to be scaled back by a factor of $u$. Therefore we have

$$L(T) = L(\bar{\Delta}_u) + u\,L(u^{-1} * \Delta_u). \qquad (4)$$

Similarly, decomposing the leaf nodes of $T$ in terms of those on $\bar{\Delta}_u$ and on $u^{-1} * \Delta_u$, we get

$$\begin{aligned}
H(T) &= -\sum_{x \in \mathrm{leaf}(T)} x \log x \\
&= -\sum_{x \in \mathrm{leaf}(\bar{\Delta}_u) \setminus \{u\}} x \log x \;-\; \sum_{y \in \mathrm{leaf}(u^{-1} * \Delta_u)} u y \log (u y) \\
&= -\Big( \sum_{x \in \mathrm{leaf}(\bar{\Delta}_u)} x \log x - u \log u \Big) - \Big( u \sum_{y \in \mathrm{leaf}(u^{-1} * \Delta_u)} y \log y + u \log u \Big) \\
&= H(\bar{\Delta}_u) + u\,H(u^{-1} * \Delta_u), \qquad (5)
\end{aligned}$$

and the desired result follows immediately from (4) and (5). ■
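
The decomposition of Lemma 1 can be checked numerically. The sketch below is only an illustration under stated assumptions: it builds a binary Huffman tree for an arbitrarily chosen example distribution, picks each intermediate node $u$ in turn, forms the merged source (the tree 'above' $u$) and the normalized source (the sub-tree under $u$), and compares both sides of (3). All function names are hypothetical, and the redundancies of the two smaller sources are obtained by running Huffman's algorithm on them again, which yields the same expected lengths because Huffman codes are optimal.

```python
import heapq
from math import log2


def internal_nodes(probs):
    """Run Huffman's algorithm; return (probability, leaf indices) for each merge."""
    heap = [(p, i, (i,)) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    counter = len(probs)
    nodes = []
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)
        p2, _, b = heapq.heappop(heap)
        nodes.append((p1 + p2, a + b))
        heapq.heappush(heap, (p1 + p2, counter, a + b))
        counter += 1
    return nodes


def redundancy(probs):
    # L(T) equals the sum of the probabilities of the intermediate nodes.
    length = sum(prob for prob, _ in internal_nodes(probs))
    entropy = -sum(x * log2(x) for x in probs if x > 0)
    return length - entropy


def lemma1_gap(probs, node_index):
    """R(T) minus the right-hand side of (3) for the chosen intermediate node."""
    u, leaves = internal_nodes(probs)[node_index]
    below = [probs[i] / u for i in leaves]  # normalized sub-tree under u
    above = [x for i, x in enumerate(probs) if i not in leaves] + [u]  # leaves under u merged
    return redundancy(probs) - (redundancy(above) + u * redundancy(below))


probs = [0.35, 0.2, 0.2, 0.15, 0.1]  # example source, chosen for illustration
for k in range(len(probs) - 1):
    print(k, lemma1_gap(probs, k))
```

Each printed gap should be zero up to floating-point rounding.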

3 Upper Bound 

In this section we provide a tight upper bound for the redundancy of the Huff- 
man code for a source containing a symbol with a given probability p. Note 
that p need not be the probability of the most or the least likely symbol. 

Theorem 1 (Tight Upper Bound on Huffman Redundancy). Consider the Huffman code for a source with finite alphabet, which includes a symbol with probability $p$, but is otherwise arbitrary. The redundancy of this code is upper bounded by

$$R_{\max}(p) := \begin{cases} 2 - p - \mathcal{H}(p), & \text{if } 0.5 \le p < 1 \\ 1 + p - \mathcal{H}(p), & \text{if } 0 < p < 0.5 \end{cases} \qquad (6)$$

where $\mathcal{H}(p) := -p \log p - (1 - p) \log(1 - p)$ is the binary entropy function. Furthermore, this bound is tight, in the sense that there are sequences of sources whose Huffman redundancies converge to $R_{\max}(p)$.

Before we prove this theorem, we shall review some previous related results, 
which will be used in our proof. 

Our result improves the following bound obtained in [12], and in fact proves a conjecture for the tightest upper bound given in the same paper:

Theorem 2. Let $p$ be the probability of any source symbol. Then the redundancy of the corresponding Huffman code is upper bounded by

$$R_{\mathrm{upperbound}}(p) := \begin{cases} 2 - p - \mathcal{H}(p), & \text{if } 0.5 \le p < 1 \\ 0.5, & \text{if } \pi_0 \le p < 0.5 \\ 1 + p - \mathcal{H}(p), & \text{if } p < \pi_0 \end{cases} \qquad (7)$$

where $\pi_0 \approx 0.18$ is the smallest root of the equation $1 + p - \mathcal{H}(p) = 0.5$.
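
To make the two bounds concrete, the following Python sketch evaluates $R_{\max}(p)$ from (6) and the bound of (7), and locates $\pi_0$ by bisection. It is a minimal illustration rather than part of the original argument: the function names, the bisection bracket $[0.01, 1/3]$ and the iteration count are assumptions made here.

```python
from math import log2


def binary_entropy(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def r_max(p):
    """Upper bound of Theorem 1, equation (6)."""
    if p >= 0.5:
        return 2 - p - binary_entropy(p)
    return 1 + p - binary_entropy(p)


def bisect_root(func, lo, hi, iterations=100):
    """Plain bisection; assumes func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


# Smallest root of 1 + p - H(p) = 0.5; the expression is decreasing on (0, 1/3).
pi_0 = bisect_root(lambda p: 1 + p - binary_entropy(p) - 0.5, 0.01, 1 / 3)


def r_upperbound(p):
    """Bound of Theorem 2, equation (7)."""
    if p >= 0.5:
        return 2 - p - binary_entropy(p)
    if p >= pi_0:
        return 0.5
    return 1 + p - binary_entropy(p)


print(pi_0, r_max(0.3), r_upperbound(0.3))
```

The two functions agree outside the central region and differ only for $\pi_0 \le p < 0.5$, where (6) lies below the constant 0.5.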
Figure 2: Upper bounds of Theorems 1, 2 and 3 on the redundancy of a source containing a symbol with probability $p$.



We will skip the proof here and refer the reader to the original paper [12]. This upper bound is tight when $p \ge 0.5$ or $p < \pi_0 \approx 0.18$, but as also suggested in [12], it is not tight for the central region $0.18 \le p < 0.5$. Thus, in our proof of Theorem 1 we will only consider this central region and obtain a tight bound for the redundancy.

We will also use the following upper bound on the redundancy of a source whose most likely symbol probability is known. A more precise form of this bound is available in the literature on $p_1$ cited in the Introduction, to which we refer the interested reader for details and proof.

Theorem 3. Let $p_1$ be the probability of the most likely symbol of a source. Then the Huffman redundancy of this source is upper bounded by the following function:

$$f(p_1) = \begin{cases} 2 - p_1 - \mathcal{H}(p_1), & \text{if } 0.5 \le p_1 < 1 \\ 3 - 5p_1 - \mathcal{H}(2p_1), & \text{if } \pi_1 \le p_1 < 0.5 \\ \gamma, & \text{if } p_1 < \pi_1 \end{cases} \qquad (8)$$

where $\gamma = R_{\max}(\tfrac{1}{3}) = 1 + 1/3 - \mathcal{H}(1/3) \approx 0.415$, and $\pi_1 \approx 0.491$ is a root of $3 - 5p_1 - \mathcal{H}(2p_1) = \gamma$; see Figure 2.

Finally we will need the following lemma in our proof of Theorem 1.
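
The constants $\gamma$ and $\pi_1$ of Theorem 3 can be reproduced with a few lines of Python. This is a sketch under assumptions made here: the bisection bracket $[0.45, 0.4999]$ was chosen because $3 - 5p_1 - \mathcal{H}(2p_1)$ is increasing on that interval, and all names are illustrative.

```python
from math import log2


def binary_entropy(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


# gamma = R_max(1/3) = 1 + 1/3 - H(1/3)
gamma = 1 + 1 / 3 - binary_entropy(1 / 3)


def middle_branch(p1):
    """Second branch of equation (8)."""
    return 3 - 5 * p1 - binary_entropy(2 * p1)


# Bisection for the root of middle_branch(p1) = gamma on the assumed bracket.
lo, hi = 0.45, 0.4999
for _ in range(100):
    mid = (lo + hi) / 2
    if middle_branch(mid) < gamma:
        lo = mid
    else:
        hi = mid
pi_1 = (lo + hi) / 2


def f_upper(p1):
    """Piecewise bound of Theorem 3, equation (8)."""
    if p1 >= 0.5:
        return 2 - p1 - binary_entropy(p1)
    if p1 >= pi_1:
        return middle_branch(p1)
    return gamma


print(gamma, pi_1, f_upper(0.2), f_upper(0.495), f_upper(0.8))
```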

Lemma 2. Let $p_1$ be the probability of the most likely letter in a source, which also contains another letter with probability $q$. If $p_1 \ge \pi_1 \approx 0.491$ and $q \ge \pi_0 \approx 0.18$, then $l(p_1) = 1$, i.e. the length of the codeword corresponding to the most likely symbol is 1.

Proof. We first note that as long as $p_1 > \tfrac{1}{3}$, the length $l(p_1)$ cannot be larger than 2; otherwise, there would be at least two independent intermediate nodes $x$ and $y$ on the Huffman tree with lengths less than $l(p_1)$. Therefore $x$ and $y$ both would have probabilities at least as large as $p_1$; but this is a contradiction since $p_1 + x + y \ge 3p_1$ cannot exceed 1.
Figure 3: A Huffman tree with $l(p_1) = 2$.



Suppose next that $l(p_1) = 2$. Then there can be no codeword of length 1 in the code, since $p_1$ is the largest probability. Therefore, the corresponding tree has a structure as in Figure 3.

Note first that $v_1 + v_2 \ge p_1$ since $l(v_1 + v_2) = 1 < l(p_1) = 2$. We will next show that $u \ge q$. Clearly, if $q$ lies in the sub-tree under $u$, the assertion follows. Suppose then that $q$ lies in the sub-tree under $v_1$. Note next that, by the construction of the Huffman tree, either $p_1, u \ge v_1, v_2$, or $p_1, u \le v_1, v_2$. The second alternative cannot happen, since then $1 \ge p_1 + v_1 + v_2 \ge 3p_1 \ge 3\pi_1 \approx 1.47$ would be a contradiction. Therefore $u \ge v_1 \ge q$.

Combining these relationships we get

$$\pi_1 \le p_1 \le v_1 + v_2 = 1 - p_1 - u \le 1 - p_1 - q \le 1 - \pi_1 - \pi_0,$$

which is a contradiction, since $\pi_1 \approx 0.491$ and $1 - \pi_0 - \pi_1 \approx 0.329$. ■

Proof of Theorem 1. As stated before, when $p < \pi_0 \approx 0.18$ or $p \ge 0.5$ our bound coincides with the bound of Theorem 2. It remains to show that for $\pi_0 \le p < 0.5$, the redundancy of a Huffman code for a source which contains a symbol with probability $p$ cannot be larger than $R_{\max}(p)$.

We prove this claim using an argument on $p_1$, the probability of the most likely symbol in $\mathbf{p}$. First note that, if $p = p_1$ is the most likely symbol, then from Theorem 3, $R(T) \le f(p) \le R_{\max}(p)$. Suppose then that $p < p_1$. Clearly $p_1 \le 1 - p$ since $p$ and $p_1$ are probabilities in the same distribution $\mathbf{p}$. Once again, from Theorem 3, if $p_1 < \pi_1$, then the Huffman redundancy of $\mathbf{p}$ cannot exceed $R_{\max}(p)$ for any $p$, since $R_{\max}(p) \ge R_{\max}(\tfrac{1}{3}) \ge f(p_1)$ for all $p_1 < \pi_1$.

Suppose then that $p_1 \ge \pi_1$. Let $T$ be a Huffman tree for $\mathbf{p}$, which contains $p \ge \pi_0$ as a leaf. Then by Lemma 2 we have $l(p_1) = 1$. Thus $p$ should appear in the sub-tree under the intermediate node of probability $1 - p_1$, i.e. $p \in \Delta_{1-p_1}$ and hence $q := \frac{p}{1 - p_1} \in (1 - p_1)^{-1} * \Delta_{1-p_1}$. Then by Lemma 1 we have

$$R(T) = 1 - \mathcal{H}(p_1) + (1 - p_1)\, R\big((1 - p_1)^{-1} * \Delta_{1-p_1}\big). \qquad (9)$$

We then upper bound the term $R\big((1 - p_1)^{-1} * \Delta_{1-p_1}\big)$ using Theorem 2. Note that $\pi_0 \le q \le 1$, since $p$ was assumed to be at least $\pi_0$. We consider the following two cases on the possible values of $q = \frac{p}{1 - p_1}$:

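• Case 1: $\pi_0 \le q < 0.5$. From Theorem 2, $R\big((1 - p_1)^{-1} * \Delta_{1-p_1}\big) \le \tfrac{1}{2}$. Then using (9) we get

$$R(T) \le 1 - \mathcal{H}(p_1) + \frac{1 - p_1}{2}. \qquad (10)$$

The right-hand side of the above inequality is a convex function of $p_1 \in (\pi_1, 1 - \pi_0)$ and it is easy to verify that it takes its maximum value at the boundary point $p_1 = 1 - \pi_0 \approx 0.82$. Then we have

$$\begin{aligned}
R(T) &\le \max_{p_1 \in (\pi_1, 1 - \pi_0)} \Big( 1 - \mathcal{H}(p_1) + \frac{1 - p_1}{2} \Big) \\
&= 1 - \mathcal{H}(\pi_0) + \frac{\pi_0}{2} \approx 0.410 \\
&< \min_{p \in (\pi_0, 0.5)} R_{\max}(p) \\
&= R_{\max}(\tfrac{1}{3}) \approx 0.415.
\end{aligned}$$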

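• Case 2: $0.5 \le q \le 1$. Define $\beta := 1 - q$, so that $\mathcal{H}(\beta) = \mathcal{H}(q)$. Then once again, using the upper bound of Theorem 2 in (9) we get

$$\begin{aligned}
R(T) &\le 1 - \mathcal{H}(p_1) + (1 - p_1)\, R_{\mathrm{upperbound}}(q) \\
&= 1 - \mathcal{H}(p_1) + (1 - p_1)\big(2 - q - \mathcal{H}(q)\big) \\
&\le 1 - \mathcal{H}(p) + (1 - p_1)\big(2 - q - \mathcal{H}(q)\big) \\
&= 1 - \mathcal{H}(p) + p + (1 - p_1)\big(2 - 2q - \mathcal{H}(q)\big) \\
&= R_{\max}(p) - (1 - p_1)\big(\mathcal{H}(\beta) - 2\beta\big) \\
&\le R_{\max}(p),
\end{aligned}$$

where in the third line we have used the fact that $\mathcal{H}(p_1) \ge \mathcal{H}(p)$ since $p < p_1 \le 1 - p$, in the fourth line that $(1 - p_1)\,q = p$, and the last inequality follows from the fact that $\mathcal{H}(\beta) - 2\beta \ge 0$ for $0 \le \beta \le 0.5$.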

Thus we have shown that $R(T)$ is upper bounded by $R_{\max}(p)$ for all $p$. It only remains to show the tightness of the bound. It is easy to check that the redundancy of a source with distribution $\mathbf{p}_\epsilon := \big( (1 - \epsilon)(1 - p),\; p,\; \epsilon(1 - p) \big)$ is

$$R(\mathbf{p}_\epsilon) = 1 + p - \mathcal{H}(p) - (1 - p)\big(\mathcal{H}(\epsilon) - \epsilon\big), \qquad (11)$$

which tends to $R_{\max}(p) = 1 + p - \mathcal{H}(p)$ as $\epsilon$ goes to zero. This completes the proof. ■
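
The tightness argument can be illustrated numerically for the case $p < 0.5$. The sketch below, written under assumptions made here (the value $p = 0.3$, the list of $\epsilon$ values and the function names are all illustrative), computes the Huffman redundancy of the three-symbol source $\mathbf{p}_\epsilon$ directly and compares it with the closed form (11) and with $R_{\max}(p)$.

```python
import heapq
from math import log2


def binary_entropy(p):
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def redundancy(probs):
    """Binary Huffman redundancy: expected length minus entropy."""
    heap = list(probs)
    heapq.heapify(heap)
    length = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        length += merged  # expected length = sum of merged-node probabilities
        heapq.heappush(heap, merged)
    return length + sum(x * log2(x) for x in probs if x > 0)


def source(p, eps):
    """The three-symbol distribution used in the tightness argument."""
    return [(1 - eps) * (1 - p), p, eps * (1 - p)]


def closed_form(p, eps):
    """Equation (11)."""
    return 1 + p - binary_entropy(p) - (1 - p) * (binary_entropy(eps) - eps)


p = 0.3  # example value in (0, 0.5)
r_max_p = 1 + p - binary_entropy(p)
for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    print(eps, redundancy(source(p, eps)), closed_form(p, eps), r_max_p)
```

As $\epsilon$ shrinks, the first two printed columns stay equal and approach the last one from below.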



4 Lower Bound 

In this section we provide a lower bound for the redundancy of the Huffman code for a source that contains a symbol with a given probability $p$. As we will see, the redundancy can be zero only if $p$ is dyadic, i.e. $p = 2^{-l}$ for some integer $l$. We will also show that our bound is tight and describe all possible sets of distributions which achieve this redundancy.

Theorem 4 (Tight Lower Bound on Huffman Redundancy). Consider the Huffman code for a source with finite alphabet, which includes a symbol with probability $p$, but is otherwise arbitrary. The redundancy of this code is lower bounded by

$$R_{\min}(p) := mp - \mathcal{H}(p) - (1 - p) \log\big(1 - 2^{-m}\big), \qquad (12)$$

where $m > 0$ takes whichever of the values $\lfloor -\log p \rfloor$ or $\lceil -\log p \rceil$ minimizes the expression. Furthermore, this bound is tight, i.e. there exist sources containing a symbol with probability $p$, whose Huffman redundancies equal $R_{\min}(p)$.
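
A direct evaluation of (12) is sketched below in Python. The function names are illustrative assumptions; the constraint $m > 0$ is enforced by clamping both candidates to at least 1, which matters only for $p > 0.5$.

```python
from math import ceil, floor, log2


def binary_entropy(p):
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def r_min(p):
    """Lower bound of Theorem 4, equation (12), for 0 < p < 1."""
    t = -log2(p)
    # Candidate depths: the two integers next to -log2(p), kept strictly positive.
    candidates = {max(1, floor(t)), max(1, ceil(t))}
    return min(
        m * p - binary_entropy(p) - (1 - p) * log2(1 - 2.0 ** (-m))
        for m in candidates
    )


for p in (0.5, 0.25, 0.3, 0.1, 0.7):
    print(p, r_min(p))
```

The bound vanishes at dyadic values such as $p = 0.5$ and $p = 0.25$ and is strictly positive in between.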
Figure 4: Canonical structure for minimum-redundancy Huffman trees that contain a symbol with probability $p$.



Proof. We first note that, for the purposes of minimizing the redundancy, it suffices to only consider a simple class of probability distributions, which is depicted in Figure 4. To see this, let $u$ be any intermediate node in a Huffman tree $T$, which does not contain $p$ in the sub-tree under it, i.e. $p \notin \Delta_u$. Then from Lemma 1,

$$R(T) = R(\bar{\Delta}_u) + u\,R(u^{-1} * \Delta_u) \ge R(\bar{\Delta}_u). \qquad (13)$$

Therefore $\bar{\Delta}_u$ is a Huffman tree containing a leaf with probability $p$, whose redundancy is at most equal to that of $T$. This argument can be repeated until $T$ is converted into the form of Figure 4.

Suppose then, without loss of generality, that $T$ has this simplified form. Now let $x_i = \alpha_i \cdot (1 - p)$ for $i = 1, \ldots, m$, where $\sum_{i=1}^{m} \alpha_i = 1$. We can write the following expressions for the expected length of the code and the entropy of the source:

$$L(T) = \sum_{i=1}^{m} x_i \cdot i + pm = 1 + (m - 1)p + (1 - p) \sum_{i=1}^{m} (i - 1)\alpha_i$$

and

$$H(T) = -\sum_{i=1}^{m} (1 - p)\alpha_i \log\big((1 - p)\alpha_i\big) - p \log p = \mathcal{H}(p) + (1 - p)\, H(\alpha_1, \alpha_2, \ldots, \alpha_m).$$

Thus,

$$\begin{aligned}
R(T) &= 1 + (m - 1)p - \mathcal{H}(p) + (1 - p)\Big[ \sum_{i=1}^{m} (i - 1)\alpha_i - H(\alpha_1, \alpha_2, \ldots, \alpha_m) \Big] \\
&= 1 + (m - 1)p - \mathcal{H}(p) + (1 - p)\, f(\alpha_1, \alpha_2, \ldots, \alpha_m) \qquad (14)
\end{aligned}$$

where $f(\alpha_1, \alpha_2, \ldots, \alpha_m) := \sum_{i=1}^{m} (i - 1)\alpha_i - H(\alpha_1, \alpha_2, \ldots, \alpha_m)$. For each fixed length $m$ of the tree, we wish to find the values of $\alpha_i$ which minimize $R(T)$. Note first that the minimizing probability vector $(\alpha_1, \ldots, \alpha_m)$ must be an interior point in the probability simplex, since in the case $\alpha_m = 0$, we can remove $x_m = 0$ from the distribution, replacing $m$ with $(m - 1)$, and lower the redundancy.
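
Figure 5: Lower bound and upper bound on the redundancy of a source containing a symbol with probability $p$. (Plot not reproduced; the horizontal axis is $p$ from 0 to 1, and the labeled points include (0.5, 0.5) and (0.369, 0.05).)

Next note that $R(T)$ depends on the $\alpha_i$'s only through $f(\cdot)$. Writing $\alpha_m = 1 - \sum_{i=1}^{m-1} \alpha_i$ and differentiating $f$ with respect to $\alpha_1, \ldots, \alpha_{m-1}$ we get

$$\begin{aligned}
\frac{\partial f}{\partial \alpha_i} &= (i - 1) - (m - 1) + \big(1 + \log \alpha_i\big) - \big(1 + \log(1 - \alpha_1 - \cdots - \alpha_{m-1})\big) \\
&= (i - m) + \log \frac{\alpha_i}{1 - \alpha_1 - \cdots - \alpha_{m-1}}.
\end{aligned}$$

Setting $\frac{\partial f}{\partial \alpha_i} = 0$ for $i = 1, \ldots, m - 1$ we get

$$\frac{\alpha_i}{1 - \alpha_1 - \cdots - \alpha_{m-1}} = 2^{m-i}.$$

The unique solution of the above system of equations is

$$\alpha_i = \frac{2^{m-i}}{2^m - 1}, \qquad i = 1, \ldots, m. \qquad (15)$$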

Plugging these values into (14), and after straightforward manipulations, we get

$$R(T) = mp - \mathcal{H}(p) - (1 - p) \log\big(1 - 2^{-m}\big).$$

This is readily seen to be a convex function of $m$. To minimize, we differentiate with respect to $m$ and set the derivative equal to zero,

$$\frac{\partial R(T)}{\partial m} = p - \frac{1 - p}{2^m - 1} = 0,$$

yielding $m = -\log p$. Since $m$ needs to be an integer, by convexity one of the two neighboring integers $\lfloor -\log p \rfloor$ or $\lceil -\log p \rceil$ will give the minimum. It remains to verify that the $\alpha_i$ values of (15) are consistent with a Huffman tree of the form in Figure 4. A necessary and sufficient condition for this is that $p \le x_{m-1} = (1 - p)\alpha_{m-1}$. It is then easy to see that the chosen value of $m$ in $\{ \lfloor -\log p \rfloor, \lceil -\log p \rceil \}$ which minimizes (12) results in an $\alpha_{m-1}$ coefficient which satisfies this condition. ■
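
The construction in the proof can be checked numerically. The Python sketch below, with illustrative names and example values of $p$ that are assumptions made here, builds the 'backbone' distribution $x_i = (1 - p)\,2^{m-i}/(2^m - 1)$ together with the symbol $p$, runs Huffman's algorithm on it, and compares the resulting redundancy with the closed form of (12).

```python
import heapq
from math import ceil, floor, log2


def redundancy(probs):
    """Binary Huffman redundancy: expected length minus entropy."""
    heap = list(probs)
    heapq.heapify(heap)
    length = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        length += merged  # expected length = sum of merged-node probabilities
        heapq.heappush(heap, merged)
    return length + sum(x * log2(x) for x in probs if x > 0)


def r_formula(p, m):
    """Right-hand side of (12) for a given depth m."""
    h = -p * log2(p) - (1 - p) * log2(1 - p)
    return m * p - h - (1 - p) * log2(1 - 2.0 ** (-m))


def best_m(p):
    """Minimizing depth among the two integers next to -log2(p), kept positive."""
    t = -log2(p)
    return min({max(1, floor(t)), max(1, ceil(t))}, key=lambda m: r_formula(p, m))


def backbone(p, m):
    """Leaves x_1..x_m from (15), followed by the symbol p itself."""
    xs = [(1 - p) * 2 ** (m - i) / (2 ** m - 1) for i in range(1, m + 1)]
    return xs + [p]


for p in (0.3, 0.1, 0.45, 0.7, 0.05):
    m = best_m(p)
    print(p, m, redundancy(backbone(p, m)), r_formula(p, m))
```

For each example value the two redundancy columns should agree up to rounding.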
Remark 1. An alternative way to describe the minimizing value of $m$ in Theorem 4 is as follows: the minimizing $m$ satisfies

$$\beta_m \le p < \beta_{m-1} \qquad (16)$$

where $\beta_0 := 1$ and, for $k \ge 1$, $\beta_k$ is given by

$$\beta_k := \Big( 1 + 1 \Big/ \log\Big( 1 + \frac{1}{2^{k+1} - 2} \Big) \Big)^{-1}.$$

This equation is obtained by equating the values of $R(T)$ from (12) for two consecutive integers. It is easy to see that $\beta_k$ is a descending sequence, converging to 0 as $k$ grows to infinity, so that for any $p \in (0, 1)$ there exists a unique $m$ satisfying (16). The first few $\beta_k$'s are $\beta_1 \approx 0.369$, $\beta_2 \approx 0.182$, $\beta_3 \approx 0.091, \ldots$ and are displayed in Figure 5.
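
The thresholds of Remark 1 are easy to tabulate. The following Python sketch assumes the closed form $\beta_k = \big(1 + 1/\log_2(1 + 1/(2^{k+1} - 2))\big)^{-1}$, which is what equating (12) for $m = k$ and $m = k + 1$ gives, and compares the depth selected by the thresholds with a brute-force minimization of (12). Function names and test values are illustrative.

```python
from math import log2


def beta(k):
    """Threshold at which depths k and k + 1 give the same value in (12)."""
    if k == 0:
        return 1.0
    c = log2(1 + 1 / (2 ** (k + 1) - 2))
    return 1 / (1 + 1 / c)


def r_formula(p, m):
    """Right-hand side of (12) for a given depth m."""
    h = -p * log2(p) - (1 - p) * log2(1 - p)
    return m * p - h - (1 - p) * log2(1 - 2.0 ** (-m))


def m_from_thresholds(p):
    """Smallest m >= 1 with beta(m) <= p, i.e. beta(m) <= p < beta(m - 1)."""
    m = 1
    while beta(m) > p:
        m += 1
    return m


print([round(beta(k), 3) for k in range(1, 4)])
for p in (0.05, 0.1, 0.2, 0.3, 0.4, 0.6, 0.9):
    brute = min(range(1, 40), key=lambda m: r_formula(p, m))
    print(p, m_from_thresholds(p), brute)
```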

Remark 2. From (12), the lower bound $R_{\min}(p)$ can be zero if and only if $p = 2^{-n}$ is dyadic. In that case, from (15), $x_i = (1 - p)\alpha_i = 2^{-i}$. Therefore the entire distribution is dyadic.

Remark 3. The proof of Theorem 4 essentially describes all the source distributions that contain a symbol of probability $p$ and achieve the lower bound $R_{\min}(p)$ on the redundancy. As argued, the Huffman trees for all such distributions have a 'backbone' of the form of Figure 4 with probabilities that are uniquely determined by the theorem. Any such tree which extends beyond this unique backbone must satisfy the inequality of (13) with equality, i.e. $R(x_i^{-1} * \Delta_{x_i})$ must be zero for all intermediate $x_i$'s. From Remark 2 above, this can happen only if the corresponding distributions for the sub-trees $x_i^{-1} * \Delta_{x_i}$ are dyadic. Thus, all the distributions containing a symbol of probability $p$ which achieve the lower bound $R_{\min}(p)$ can be obtained in the following way:

Start with the backbone distribution described in Theorem 4. At any time, choose a leaf node other than $p$ and split its probability in half. Each tree during this process is a Huffman tree with redundancy $R_{\min}(p)$.
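
The splitting procedure of Remark 3 can be tried out with a short Python sketch. It starts from the backbone for an example value $p = 0.3$ with $m = 2$ (the minimizing depth for that $p$), repeatedly halves a leaf other than $p$, and prints the Huffman redundancy after each step. The names, the example value and the round-robin choice of which leaf to split are assumptions made for illustration.

```python
import heapq
from math import log2


def redundancy(probs):
    """Binary Huffman redundancy: expected length minus entropy."""
    heap = list(probs)
    heapq.heapify(heap)
    length = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        length += merged  # expected length = sum of merged-node probabilities
        heapq.heappush(heap, merged)
    return length + sum(x * log2(x) for x in probs if x > 0)


def backbone(p, m):
    """Leaves x_i = (1 - p) 2^(m - i) / (2^m - 1), followed by the symbol p."""
    return [(1 - p) * 2 ** (m - i) / (2 ** m - 1) for i in range(1, m + 1)] + [p]


def split_leaf(dist, index):
    """Replace the leaf at `index` by two leaves of half its probability."""
    half = dist[index] / 2
    return dist[:index] + [half, half] + dist[index + 1:]


p, m = 0.3, 2
dist = backbone(p, m)
history = [redundancy(dist)]
for step in range(6):
    # The symbol p is kept as the last entry and is never split.
    dist = split_leaf(dist, step % (len(dist) - 1))
    history.append(redundancy(dist))
print(history)
```

All entries of `history` should coincide up to rounding, in line with the remark.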

In the remainder of this section, we extend the results of Theorem 4 to the cases when the given probability $p$ corresponds to the most, or the least likely symbol.

Suppose first that $p$ is constrained to be the probability of the most likely symbol of a source. Let $\mathbf{p}$ be a distribution as prescribed by Theorem 4 which achieves the minimum redundancy $R_{\min}(p)$, but in which $p$ is not necessarily the maximum probability. Then, by the argument of Remark 3 above, each symbol probability of $\mathbf{p}$ other than $p$ can be successively split into two halves. We can repeat this process until $p$ becomes the largest value in the distribution. Thus we have the following result:

Theorem 5. (c.f. Theorem 2 in the reference on $p_1$ cited in the Introduction.) A tight lower bound for the Huffman redundancy of a source whose maximum symbol probability is $p_1$ is $R_{\min}(p_1)$, as defined in Theorem 4.

Next suppose that $p < 0.5$ is constrained to be the probability of the least likely symbol of a source. The argument used in the proof of Theorem 4 can be extended to this case, since even without this additional constraint, we found that in order to achieve the minimum redundancy, $p$ should have the maximum length in the canonical case of Figure 4. We only need to apply the more stringent constraint

$$p \le x_m = \frac{1 - p}{2^m - 1}. \qquad (17)$$

From the convexity of the function $R(T)$ in (14), the minimum will be achieved at one of the points on the boundary of the constraint set which are closest to the optimal $m$. Of the two neighboring integers to the optimal $m$, only $m = \lfloor \log \frac{1}{p} \rfloor$ satisfies the constraint of (17). On the other hand, satisfying the constraint (17) with equality corresponds to a Huffman tree in the form of Figure 4 in which $x_m = p$. But, as argued in the remarks above, the redundancy of any such code is lower bounded by $R_{\min}(2p)$, since $x_m$ and $p$ can be merged together with no loss in redundancy, and the resulting tree has a leaf with probability $2p$. Therefore the smallest achievable redundancy is the minimum of $R_{\min}(2p)$ and the value of (14) for $m = \lfloor \log \frac{1}{p} \rfloor$. Finally, it is easy to see that

$$p \Big\lfloor \log \frac{1}{p} \Big\rfloor - \mathcal{H}(p) - (1 - p) \log\Big(1 - 2^{-\lfloor \log \frac{1}{p} \rfloor}\Big) \le 2p \Big\lfloor \log \frac{1}{2p} \Big\rfloor - \mathcal{H}(2p) - (1 - 2p) \log\Big(1 - 2^{-\lfloor \log \frac{1}{2p} \rfloor}\Big).$$

Thus we have proven the following:

Theorem 6. (c.f. Theorem 2 in the reference on $p_N$ cited in the Introduction.) A tight lower bound for the Huffman redundancy of a source whose minimum symbol probability is $p_N = p$ is the smaller of the following two functions:

$$p \Big\lfloor \log \frac{1}{p} \Big\rfloor - \mathcal{H}(p) - (1 - p) \log\Big(1 - 2^{-\lfloor \log \frac{1}{p} \rfloor}\Big), \quad \text{and} \qquad (18)$$

$$2p \Big\lceil \log \frac{1}{2p} \Big\rceil - \mathcal{H}(2p) - (1 - 2p) \log\Big(1 - 2^{-\lceil \log \frac{1}{2p} \rceil}\Big). \qquad (19)$$

Figure 6 plots the lower bounds of Theorems 4 and 6 as a function of the fixed probability $p$.

Figure 6: Lower bound for the Huffman redundancy of a source containing a symbol with probability $p$.

4.1 Extension to the D-ary Huffman Codes

The method of Section 4 can be extended to obtain tight lower bounds for $D$-ary Huffman codes. By using the same argument as in the binary case, we can show that the redundancy of a Huffman tree decreases by merging the symbols in the sub-tree under an intermediate node. Thus, for the purposes of minimizing the redundancy, it suffices to only consider Huffman trees with a structure as in Figure 7.

Figure 7: Canonical structure for minimum-redundancy $D$-ary Huffman trees that contain a symbol with probability $p$.

The following lemma shows further that in the minimum-redundancy structure of Figure 7 all the probabilities in the same level must be equal.

Lemma 3. In any minimum-redundancy $D$-ary Huffman tree of the form given in Figure 7, we must have $x_{i,l} = x_{j,l}$ for all $l = 1, \ldots, m$, and for all $i, j = 2, \ldots, D$.

Proof. If $x_{i,l} \ne x_{j,l}$ for some $i$, $j$ and $l$, then we replace all the probabilities in that level by their average $\bar{x}_l := \frac{1}{D-1} \sum_k x_{k,l}$, and show that the new tree is a Huffman tree with strictly smaller redundancy. Note first that since the average is between the maximum and minimum probabilities in that level, the new probabilities are still consistent with the same Huffman structure, i.e. $x_{j,l+1} \le \min_k x_{k,l} \le \bar{x}_l \le \max_k x_{k,l} \le x_{j,l-1}$. Next note that the average length of the tree remains fixed after this process. The entropy of the original tree can be written as

$$H(T) = h_0 + (D - 1)\,\bar{x}_l\, H_D\Big( \frac{x_{2,l}}{(D - 1)\bar{x}_l}, \ldots, \frac{x_{D,l}}{(D - 1)\bar{x}_l} \Big),$$

where $h_0$ is the contribution from the probabilities in the other levels, and $H_D(\cdot)$ is the base-$D$ entropy function, which is uniquely maximized by the uniform distribution. Therefore the entropy is maximized, and the redundancy minimized, with the proposed replacement $x_{i,l} \leftarrow \bar{x}_l$. ■

Using Lemma 3 and an argument identical to the one in the proof of Theorem 4 we get the following result:

Theorem 7. The redundancy of a $D$-ary Huffman code containing a letter with probability $p$ is tightly lower bounded by

$$R_{\min,D}(p) = mp - \mathcal{H}_D(p) - (1 - p) \log_D\big(1 - D^{-m}\big) \qquad (20)$$

where $m$ is either $\lfloor -\log_D p \rfloor$ or $\lceil -\log_D p \rceil$, whichever minimizes the above expression, and $\mathcal{H}_D(p) := \mathcal{H}(p) / \log(D)$ is the $D$-ary entropy function.
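
A small Python sketch for evaluating (20) is given below. It is only an illustration: the function names are assumptions, logarithms are computed in base 2 and converted to base $D$, and both candidate depths are clamped to at least 1, mirroring the requirement $m > 0$ of the binary case.

```python
from math import ceil, floor, log2


def binary_entropy(p):
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def r_min_d(p, D):
    """Lower bound of Theorem 7, equation (20), for 0 < p < 1 and D >= 2."""
    log_d = log2(D)
    h_d = binary_entropy(p) / log_d  # D-ary entropy function
    t = -log2(p) / log_d             # -log_D(p)
    # Candidate depths: the two integers next to -log_D(p), kept strictly positive.
    candidates = {max(1, floor(t)), max(1, ceil(t))}
    return min(
        m * p - h_d - (1 - p) * log2(1 - float(D) ** (-m)) / log_d
        for m in candidates
    )


print(r_min_d(1 / 3, 3), r_min_d(0.3, 2), r_min_d(0.2, 4))
```

For $D = 2$ the function reduces to the binary bound (12), and for $p = 1/D$ it returns zero up to rounding, as expected for a uniform $D$-ary source.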



5 Conclusion 

In this paper we have obtained tight upper and lower bounds on the redundancy of the Huffman code for a source for which the probability of one of the symbols is known. Our upper bound proves a conjecture of [12], and our lower bound extends and completes several earlier partial results. We have further discussed the explicit form of the distributions that achieve each of these bounds. Our arguments can be extended to the case of $D$-ary Huffman codes, and some of these extensions are included in this paper.